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ABSTRACT 

We propose a distributed algorithm for sparse signal recov- 
ery in sensor networks based on Iterative Hard Thresholding 
(IHT). Every agent has a set of measurements of a signal x, 
and the objective is for the agents to recover x from their col- 
lective measurements at a minimal communication cost and 
with low computational complexity. A naive distributed im- 
plementation of IHT would require global communication of 
every agent's full state in each iteration. We find that we can 
dramatically reduce this communication cost by leveraging 
solutions to the distributed lop-K problem in the database lit- 
erature. Evaluations show that our algorithm requires up to 
three orders of magnitude less total bandwidth than the best- 
known distributed basis pursuit method. 

Index Terms — compressed sensing, distributed algo- 
rithm, iterative hard thresholding, top-_ftT 

1. INTRODUCTION 

In compressed sensing, a sparse signal x € R N is sampled 
and compressed into a set of M measurements, where M is 
typically much smaller than N. If these measurements are 
taken appropriately, then it is possible to recover x from this 
small set of measurements 1 1 1. 

Compressed sensing is an appealing approach for sen- 
sor networks, where measurement capabilities may be limited 
due to both coverage and energy constraints. Recent works 
have demonstrated that compressed sensing is applicable to 
a variety of sensor networks problems including event detec- 
tion j2 1, urban environment monitoring |3| and traffic estima- 
tion J4). In these applications, measurements of the signal 
are taken by sensors that are distributed throughout a region. 
The measurements are then collected at a single fusion center 
where signal recovery is performed. Due to limits in band- 
width, storage, and computation capabilities, it may be more 
efficient, and sometimes even necessary, to perform signal re- 
covery in the network in a distributed fashion. 
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Distributed solutions for compressed sensing have begun 
to receive attention lately. For example, one work proposes a 
distributed subspace pursuit recovery algorithm for a mixed- 
support set model J3J. This work assumes that every agent 
knows the sensing matrix of every other agent. The need for 
global knowledge of these matrices presents a scalability bot- 
tleneck as individual sensors do not have the capacity to store 
and process a large number of these matrices. Several works 
have proposed distributed basis pursuit algorithms for sparse 
signal recovery in sensor networks where the measurement 
matrices are not globally known |6|[7][8). In these algorithms, 
agents collaborate to solve a convex relaxation of the original 
recovery problem. Each agent stores its own estimate of the 
signal x, and, in each iteration, it updates this estimate based 
on communication with its neighbors in the network. This 
approach requires that every agent solve a local convex op- 
timization problem in each iteration. While these algorithms 
use only local communication, each agent must send its entire 
estimate vector to every neighbor in every iteration. This vec- 
tor is not necessarily sparse until the algorithm converges, and 
therefore, the messages can be quite large. As a result, these 
algorithms have a large total bandwidth cost. Furthermore, 
simulations show that this bandwidth cost increases dramati- 
cally as the network connectivity increases. 

We propose an alternative approach to distributed sparse 
signal recovery in sensor networks that is based on Iterative 
Hard Thresholding (IHT) (9)- In our distributed implementa- 
tion of IHT, which we call D-IHT, all agents store identical 
copies of the estimate of x. In each iteration, every agent first 
performs a local computation to derive an intermediate vector. 
The agents then perform a global computation on their inter- 
mediate vectors to derive the next iterate. A naive distributed 
implementation of IHT would require global communication 
of the intermediate vector of each agent in every iteration. We 
find that we can dramatically reduced the communication cost 
of this global computation by leveraging solutions to the dis- 
tributed top-K problem in the database literature lfT0l[TTl[T2l . 
Our evaluations show that D-IHT requires up to three orders 
of magnitude less total bandwidth than the best-known dis- 
tributed basis pursuit method. D-IHT is also computationally 
simpler since it does not require that agents solve local con- 
vex optimization problems. While, in this work, we present 



our distributed recovery algorithm for compressed sensing, 
we note that our solution easily generalizes to sparse signal 
recovery from nonlinear measurements fl3l . 

The remainder of the paper is organized as follows. In 
Section |2j we detail our problem setting and formulation and 
provide a brief description of IHT. The D-IHT algorithm is 
presented in Section [3] Section [4] gives numerical results on 
the performance of D-IHT. 

2. PRELIMINARIES 
2.1. Problem Formulation 

We consider a set of P agents that form a connected, undi- 
rected static network topology with E edges. The agents 
may be the sensors themselves or they may be fusion nodes 
that collect measurements from several nearby sensors. Every 
agent knows the number of agents P, and we assume there is 
a unique agent identified as agent 1 . If the uniquely identified 
agent is not defined a priori, one can be chosen using a variety 
of well-known distributed algorithms (see fl4\ ). Agents com- 
municate with their neighbors in the network using fixed size 
messages. Messaging is reliable but asynchronous, meaning 
that every message that is sent is eventually delivered, but the 
delay between sending and delivery may be arbitrarily long. 

There is a X-sparse signal x 6 Tt N that the agents seek 
to estimate. Each agent p = 1 . . . P has M p > (pos- 
sibly noisy) measurements of x that have been taken using 
the agent's sensing matrix A p € H ai p xN . There are M — 
Mi + . . .+Mp measurements in total. The measurement vec- 
tor of agent p, denoted b p , is given by b p = A p x + e p , where 
e p E H Mp is the measurement error for agent p. Agents 
do not know the sensing matrices or measurement vectors of 
other agents. 

Our goal is for every agent to recover the same signal x 
from their collective measurements at a minimal communica- 
tion cost. Let b be the vector of all measurements, and let A 
be the sensing matrix for the entire system: 



b := 



A := 



A 1 



A 1 



To recover x from A and b, the agents must solve the follow- 
ing optimization problem, 

x = arg min IIAc— &||| subject to llxllo < K, (1) 

x<£R N 

where || • || denotes the l norm, i.e, the number of non-zero 
components. 

This problem is known to be NP-Hard in general 03). 
However, for suitable A matrices, efficient centralized algo- 
rithms to recover x exist. Our distributed solution is based on 
IHT J}6 9], which we describe next. 



2.2. Iterative Hard Thresholding Algorithm 

IHT is a gradient-like, iterative algorithm for finding a K- 
sparse vector a; in a centralized setting where A and b are 
known. Let T K (v) be the thresholding operator which re- 
turns a vector where all but the K entries of v with the largest 
magnitude are set to (with ties broken arbitrarily). The IHT 
algorithm begins with an initial, arbitrary X-sparse vector xq. 
In each iteration, a gradient-step is performed, followed by 
application of the thresholding operator: 



x t +i 



T K (x t - aA T (b - Ax t )) . 



(2) 



It has been shown that, for a < 1/ (2X max (A T A)), IHT con- 
verges to a local minimum of ([TJ I9.1 fl3l . 

We note that, even if A satisfies the properties necessary 
to enable recovery using IHT, it is not necessary and, in fact, 
not likely that each A p satisfies these properties. Therefore it 
is not possible for any single agent to recover x on its own; 
agents must exchange information with one another to per- 
form the recovery. In the next section, we present D-IHT, our 
distributed implementation of IHT. 

3. DISTRIBUTED ITERATIVE HARD 
THRESHOLDING 

In D-IHT, every agent stores an identical copy of xt, which 
is initially 0. In each iteration t, each agent first performs 
a local computation to derive an intermediate vector z v t £ 
R w . The agents then perform a global computation on their 
intermediate vectors to derive the next iterate Xt+%, which is, 
again, identical at every agent. We now define these local and 
global computations. 

Local computation. Each agent p computes a local residual 
vector, yf :— b p — A p xt- The intermediate vector for agent p, 
denoted z P , is then computed as follows, 



x t -a(A p ) T y P ifp = l, 
—a (A P ) T y\ otherwise. 



(3) 



Note that each agent can compute zf using its local informa- 
tion. 

Global computation. In the global computation step, all 
agents must compute a function G that depends on all of their 
intermediate vectors. This function is defined as follows, 



Xt+l 



G(zj, 



(4) 



We note that the combination of the local computation step 
^ and the global computation step Q are equivalent to (|2]). 

A naive implementation of G is for all agents to collabo- 
rate to compute all N sums, one for each component of the in- 
termediate vectors. Then, each agent can independently deter- 
mine the values with the K largest magnitudes. This approach 
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(a) The vector zf and the resulting sorted list L p , at three agents. 



Fig. 1: Example execution 
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(b) Steps of TA to find top two objects. After five steps, the threshold r 
is 65, and objects 3 and 4 both have sums greater than r. No remaining 
objects can have a sum greater than r. Therefore, the top two objects 
have been found, and the algorithm terminates. 



the TA algorithm for K = 2. 



requires communication of all components of all intermediate 
vectors and is thus very costly with respect to bandwidth. We 
derive a more communication efficient approach by leverag- 
ing work in the database literature on the distributed top-A 
problem. We describe this problem and a popular solution 
We then present our distributed algorithm for 



in Section 3.1 



computing G in Section 3.2 



3.1. The Distributed Top-A Problem 

In the distributed top-A problem, each agent p has a list L p of 
pairs (o, val p (o)), where o is an object ID and val p (o) is the 
value of object o at agent p. In our recovery problem, this list 
is generated from the vector z P ; o is the index into the vector 
zf and val p (o) is the value of zf at index o. Each object 
has a score; in our case, the score is the magnitude of the 
sum of object's values at all agents. The objective is to find 
the objects with the K largest scores. Clearly, it is possible to 
solve this problem by computing the sum for every object and 
then selecting the objects with the K largest magnitude sums, 
but it is often not necessary to compute all sums in order to 
find the top-A objects. 

The Threshold Algorithm (TA) is a solution for the dis- 
tributed top-A problem that is instance optimal, i.e., TA 
makes the minimum number of sum computations necessary 
for a given input of lists iflOl ITTl [T2ll . As it was originally 
proposed, TA requires that all values be non-negativqj Here, 
we present the algorithm, assuming that this is the case. In 



Section 3.2 we explain how we modify TA to support both 



negative and non-negative values. 

In TA, each agent first sorts its list by object value, in 
descending order. One agent acts as the leader, requesting in- 
formation from the other agents, computing sums for objects, 
and distributing the final top- A" list to all agents. The algo- 
rithm proceeds as follows. The leader requests an object/score 
pair from each agent's list in sorted order, one pair from one 
agent in any given step. When the leader receives a pair con- 
taining an object it has not yet seen, it requests the value for 



'More precisely, TA requires that that an object's score is given by a func- 
tion that is monotonic in the values. 



Algorithm 1: Pseudocode for D-IHT algorithm, 
t initialize 



< 







2 

3 t 4- 

4 while true do 

5 zf 4— value from equation |3| Local computation. 

6 L p <— Sort(zf) Create sorted list from zf. 



t+i 



data(L p ) 



Global computation. 



t + 1 



that object from all other agents and computes the object's 
sum. The leader always stores the objects with the A largest 
sums it has seen so far. In addition, the leader stores the value 
of the last object seen from each agent p under the sorted ac- 
cess. This value is denoted t p . It computes the threshold 
value r = ti + ...Tp in each step. As soon as the leader has 
seen A objects that each have a score of at least t, the al- 
gorithm terminates. The leader then disseminates the list of 
top-A objects and their scores. 

An example execution of TA is given in Figure [T] Note 
that, while each list has 10 objects, the algorithm only re- 
quires five sum computations to find the top two objects. 

3.2. Distributed Computation of G 

The computation of G is equivalent to solving a top- A prob- 
lem over the vectors z P , p = 1 . . . P. In D-IHT, we solve 
this top-A problem using a modified version of TA that is 
not leader-based and that accommodates both negative and 
non-negative values. We describe our modifications and the 
resulting algorithm below. 

Support for non-negative values. As it was originally pro- 
posed, TA is applicable only to score functions like sum that 
are monotonic in the object values. To compute G, we need to 
find the top-A magnitude sums, which means that the score 
function is not monotonic unless all values are non-negative. 
A simple way to address this limitation is to run two instances 
of TA to find the top-A largest sums and the top-A' smallest 
sums (since a sum with a negative value may have a large 



magnitude). The top-K magnitude objects can then be found 
from this set of 2K objects. We implement a more efficient 
algorithm in which the agents find the top-i-T magnitude sums 
in a single algorithm instance by processing the sorted lists 
from both the top and the bottom. 

Decentralized list processing. We speed up the execution 
of TA by having each agent independently processes its own 
list in sorted order. Each agent initiates a group sum com- 
putation for each new object it encounters, so that P group 
sum computations are executed in parallel for each iteration 
of the algorithm. In a group sum computation, every agent 
learns the sum of the values for the specified object. There 
are many distributed algorithms for group sum computation 
(e.g. lfT7l[T8ll ). We use an algorithm based on the well-known 
broadcast-convergecast paradigm (see (19]). A request mes- 
sage is propagated down a broadcast tree that is rooted at the 
initiating agent. Each agent collects values from its children 
(if it has any), sums up these values with its own, and sends 
the result to its parent. For a network with E edges, the al- 
gorithm requires a preprocessing phase of less than 2E mes- 
sages to create a broadcast tree. Each group sum computation 
requires 3(P — 1) messages, and each agent p sends at most 
3D p messages, where D p is the node degree. 

The algorithm. We call our modified version of TA the Dis- 
tributed Absolute Threshold Algorithm (DATA). We briefly 
summarize the algorithm below. The pseudocode is given in 
the appendix. 

In DATA, Each agent stores two variables for global 
thresholds, one for sorted access from the top of the lists, 
denoted t p , and one for sorted access from the bottom of the 
lists, denoted t p . Both values are initially oo. Each agent 
also stores a top-K list that is initially empty. The agents 
each execute the following steps. 

1. If T p > t p , select next object from top of list for which 
p has not received a sum in a previous iteration. Else, select 
next object from bottom of list for which p has not received a 
sum in a previous iteration 

2. Initiate group sum computation for selected object. Agent 
1 also initiates a group sum computation for the global thresh- 
old, either T p or t p , depending on whether it accessed its ob- 
ject from the top or bottom of its list. When participating in 
the group sum computation for r p , an agent uses the most re- 
cent value seen in sorted access from the top of its list, and for 
t p , it uses the most recent value seen in sorted access from the 
bottom. 

3. On receipt of sums (from group computation) from all 
agents, update top-K list with objects that have sums with the 
K largest magnitudes seen so far. 

4. On receipt of threshold (from group computation), up- 
date appropriate global threshold variable, t p or t p . If 
the top-K list contains K objects with magnitudes at least 
max(|r p |, |r p |), return the list. Else, go to Step 1. 



Table 1: Recovery problem parameters. 
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20 
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Sparco 11 


1024 


256 
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1000 


200 


50 
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0.99 



We note that in a single iteration, multiple agents may initiate 
group sum computations for the same object. While, in the- 
ory, this introduces unnecessary message overhead, in prac- 
tice the redundant sums do not add significantly to the total 
message cost. 

3.3. The D-IHT Algorithm and Analysis 

We combine the local computation step with DATA to arrive 
at the full D-IHT algorithm. The pseudocode is given in Al- 
gorithm [T] We note that while agent 1 plays a unique role in 
the local and global computations, it performs the same num- 
ber and types of computations and sends the same number of 
messages as any other agent. 

Storage complexity. Both D-IHT and distributed basis pur- 
suit 10 17]|8) require O(N) storage per agent. 

Message complexity. Let T\ be the number of iterations of 
D-IHT required to achieve a certain accuracy. Let Sj be the 
number of group sum computations for iteration j (includ- 
ing the group threshold computations). Each sum computa- 
tion requires 3(P — 1) messages. Therefore, the total num- 
ber of messages is 3(P — 1) Y^jLi Sj- We compare this to 
distributed basis pursuit where, in every iteration, each agent 
sends its estimate of x to all of its neighbors. Assuming that 
the message size is limited to a single value, N messages are 
required to send a single estimate. Let T2 be the number of 
iterations of distributed basis pursuit required to achieve the 
same accuracy as Ti iterations of D-IHT. The total number 
of messages sent in distributed basis pursuit is 2NET2. For 
a connected network, E > P — 1, and therefore, the total 
number of messages is at least 2NT2(P — 1). 

The preprocessing phase of D-IHT requires at most 2EP 
messages, which is less than the number of messages re- 
quired for one iteration of distributed basis pursuit. There- 
fore, if T\ < T2, then D-IHT requires fewer messages than 
distributed basis pursuit so long as less than |iV sums are 
computed per iteration of D-IHT, on average. In our evalu- 
ations, T\ is always at least one order of magnitude smaller 
than T2, and in most cases, the average number of sums 
computed per iteration of D-IHT is far fewer than | N. 

4. NUMERICAL RESULTS 

In this section, we present an experimental comparison of 
D-IHT and distributed basis pursuit. As a representative ex- 



Table 2: 
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5.17 x 10 4 6.88 x 10 5 



ample, we select D-ADMM, a distributed implementation of 
the alternating direction method of multipliers that has been 
shown to outperform other distributed basis pursuit algo- 
rithms in terms of the number of communications in similar 
experiments [8|. In each iteration of D-ADMM, each agent 
exchanges its estimate with its neighbors and generates a new 
estimate by solving a local optimization problem involving its 
estimate and the estimates of some of its neighbors. We have 
implemented D-IHT and D-ADMM in Matlab, using CVX 
ll20l to solve the local optimization problems in D-ADMM. 
D-ADMM requires a graph coloring, which we generate us- 
ing the heuristic from the Matgraph toolbox [21], as is done 
in (SJ. We include the preprocessing phase in our results for 
D-IHT, but we do not include graph coloring pre-processing 
in our results for D-ADMM. 

Recovery problems. We evaluate the performance of D-IHT 
and D-ADMM on four reconstruction problems, similar to 
those in J8). F° r the first problem, we generate the A ma- 
trix with i.i.d Gaussian entries with zero mean and variance 
of 1/m. The remaining three problems are from the Sparco 
toolbox ll22l . The parameters for each problem are given in 
Table [T] For each problem, we divide the A matrix evenly 
among the agents so that each agent has M/P rows. For the 
randomly generated problem, we find the optimal sparse so- 
lution x using CVX. For the Sparco problems, we use the 
provided optimal sparse solution. 

Performance measures. For each algorithm, we measure the 
total number of messages sent in order for \x v t — :r||/||:r|| < 
10~ 2 for all agents. To standardize the bandwidth comparison 
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Fig. 2: Number of sums computed per iteration by D-IHT to 
solve Sparco problem 7 (with N — 2560) in a 40 node ER 
graph with connection probability of 0.25. 



between the algorithms, we assume that only one value is sent 
per message. Therefore, in D-ADMM, when an agent sends 
its A^-vector to its neighbor, this requires N messages. D-IHT 
is designed so that only one value is sent per message. We also 
measure the time required for convergence in a synchronous 
network where each message is delivered in one clock tick. 
For both algorithms, we only allow one message to be sent on 
a link in each direction per clock tick. 

Results. Table |2] shows the results of our evaluations in three 
different network topologies. The first is an Erdos-Renyi (ER) 
graph [23 1 where each pair of vertices is connected with prob- 
ability 0.25 (Figure |2a]i, and the second is an ER graph where 
each pair of vertices is connected with probability 0.75 (Fig- 
ure 



2b |. The third network topology is a geometric graph | 
with vertices placed uniformly at random in a unit square, and 
two vertices are connected if they are within a distance of 0.5 
of each other (Figure [2cj). 

These results show that, for every recovery problem, D- 
IHT requires far fewer total messages than D-ADMM to 
achieve the same recovery accuracy, between one and two 
orders of magnitude in most cases. D-IHT also requires less 
total time to perform the recovery than does D-ADMM. We 
note that, as network connectivity increases, in D-ADMM 
the total message count and total time increase (Table 2a vs. 



Table 2b I. In D-IHT, sums can be computed more quickly in 
networks that are more connected. Therefore, in D-IHT, the 
recovery time decreases as network connectivity increases. 

A key to the good performance of D-IHT is that, after just 
a few iterations, the algorithm finds the correct support set 
(the non-sparse components of the signal). The magnitudes of 
the values in the support set quickly dominate the other values 
in the intermediate vectors. As a result, in DATA, the sums 
for these objects are computed first, and the top-K objects 
are identified after a minimal number of sum computations 
(on the order of PK). This behavior is illustrated in Figure|2] 
where we show the total number of sums computed for each 
iteration of D-IHT for a single experiment. This figure shows 
a dramatic drop in the number of sum computations after just 
four iterations. 
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A. PSEUDOCODE FOR DATA 



Algorithm 2: Distributed Absolute Threshold Algo- 
rithm, as it is executed by each node p. 

9 function DATA(L P = {(ndx, val)} 1 ^^ 



10 topKList <s— 0, top <— 1, bottom <— N 

11 T P 4— OO, T P 4— OO 

12 done «— FALSE 

13 while done — false do 

14 if t p >t p then 

is oid <— new object id from top of list 

16 else 

17 oid new object id from bottom of list 

18 GroupComputeSum(oid) 

19 if p — 1 then 

20 GroupComputeThreshold(oid) 

21 Receive (ndx q , sum q ) for q = 1 . . . P and receive 
threshold 

22 if t p >t p then 

23 t p 4— threshold 

24 else 

25 t p <— threshold 

26 for q = 1 to P do 

27 if \topKList\ < K then 

28 Add (ndx q , sum q ) \a topKList 

29 else if sum q > mm afe. sum in topKList then 

30 Replace smallest magnitude element with 

(ndx 9 , sum 9 ) 

31 if mm. afa. sum in topKList > max(r p ,r p ) then 

32 done «— true 

33 x ^— GenerateVectorFromList(topA'List) 

34 return x 



